NSF PAR Search | NSF Public Access Repository


Fusing Reward and Dueling Feedback in Stochastic Bandits

Wang, Xuchuang; Zeng, Qirun; Zuo, Jinhang; Liu, Xutong; Hajiesmaili, Mohammad; Lui, John; Wierman, Adam (July 2025, ICML)

Free, publicly-accessible full text available July 25, 2026
Stochastic Bandits Robust to Adversarial Attacks

Wang, Xuchuang; Zuo, Jinhang; Liu, Xutong; Lui, John; Hajiesmaili, Mohammad (April 2025, ICLR)

Free, publicly-accessible full text available April 28, 2026
Combinatorial Logistic Bandits

https://doi.org/10.1145/3726854.3727279

Liu, Xutong; Dai, Xiangxiang; Wang, Xuchuang; Hajiesmaili, Mohammad; Lui, John CS (June 2025, ACM)

Free, publicly-accessible full text available June 9, 2026
Stochastic Bandits Robust to Adversarial Attacks

Wang, Xuchuang; Liu, Maoli; Zuo, Jinhang; Liu, Xutong; Lui, John; Hajiesmaili, Mohammad (January 2025, Proceedings of the Thirteenth International Conference on Learning Representations)

Free, publicly-accessible full text available January 22, 2026
Asynchronous Multi-Agent Bandits: Fully Distributed vs. Leader-Coordinated Algorithms

https://doi.org/10.1145/3711696

Wang, Xuchuang; Chen, Yu-Zhen Janice; Liu, Xutong; Yang, Lin; Hajiesmaili, Mohammad; Towsley, Don; Lui, John CS (March 2025, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

We study the cooperative asynchronous multi-agent multi-armed bandit problem, in which each agent's active (arm-pulling) decision rounds are asynchronous: in each round, only a subset of agents is active to pull arms, and this subset is unknown and time-varying. We consider two models of multi-agent cooperation, fully distributed and leader-coordinated, and propose algorithms for both that attain near-optimal regret and communication bounds, both almost as good as those of their synchronous counterparts. The fully distributed algorithm combines a novel communication policy, consisting of accuracy-adaptive and on-demand components, with successive arm elimination for decision-making. In the leader-coordinated algorithm, a single leader explores arms and recommends them to the other agents (followers) to exploit. Because agents' active rounds are unknown, a competent leader must be chosen dynamically; we propose a variant of the Tsallis-INF algorithm with few switches to choose such a leader sequence. Lastly, we report numerical simulations comparing our new asynchronous algorithms with known baselines.
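The abstract names successive arm elimination as the decision-making rule. As a rough illustration only, the sketch below shows the textbook single-agent version on simulated Bernoulli arms. It is not the paper's asynchronous multi-agent algorithm and includes no communication policy. The function name, the `means` argument, and the Hoeffding-style confidence radius are all assumptions made for this example.

```python
import math
import random


def successive_elimination(means, horizon, delta=0.05, seed=0):
    """Textbook single-agent successive elimination (illustrative sketch).

    `means` are hypothetical true Bernoulli means, used only to simulate
    rewards. Returns the arm indices still active when the budget ends.
    """
    rng = random.Random(seed)
    k = len(means)
    active = list(range(k))
    pulls = [0] * k
    totals = [0.0] * k
    t = 0

    # Run only full phases, so every active arm has the same pull count.
    while len(active) > 1 and t + len(active) <= horizon:
        for arm in active:
            reward = 1.0 if rng.random() < means[arm] else 0.0
            pulls[arm] += 1
            totals[arm] += reward
            t += 1

        n = pulls[active[0]]
        # Assumed Hoeffding-style radius with a union bound over arms and phases.
        radius = math.sqrt(math.log(4 * k * n * n / delta) / (2 * n))
        estimates = {arm: totals[arm] / n for arm in active}
        best_lower = max(estimates.values()) - radius

        # Keep an arm only if its upper bound reaches the best lower bound.
        active = [arm for arm in active if estimates[arm] + radius >= best_lower]

    return active


if __name__ == "__main__":
    print(successive_elimination([0.9, 0.5, 0.1], horizon=20000))
```

Arms are dropped once their upper confidence bound falls below the best lower confidence bound, so exploration effort concentrates on the arms that remain plausible candidates for the best arm.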
Free, publicly-accessible full text available March 6, 2026
Asynchronous Multi-Agent Bandits: Fully Distributed vs. Leader-Coordinated Algorithms

https://doi.org/10.1145/3726854.3727272

Wang, Xuchuang; Chen, Yu-Zhen Janice; Liu, Xutong; Yang, Lin; Hajiesmaili, Mohammad; Towsley, Don; Lui, John CS (June 2025, ACM)

Free, publicly-accessible full text available June 9, 2026
Best Arm Identification with Quantum Oracles

Wang, Xuchuang; Chen, Yu-Zhen; Guedes de Andrade, Matheus; Allcock, Jonathan; Hajiesmaili, Mohammad; Lui, John; Towsley, Don (February 2025, AAAI)

Free, publicly-accessible full text available February 10, 2026
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond

Liu, Xutong; Wang, Siwei; Zuo, Jinhang; Zhong, Han; Wang, Xuchuang; Wang, Zhiyong; Li, Shuai; Hajiesmaili, Mohammad; Lui, John CS; Chen, Wei (July 2024, Proceedings of the 41st International Conference on Machine Learning (ICML))

Full Text Available
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond

Liu, Xutong; Wang, Siwei; Zuo, Jinhang; Zhong, Han; Wang, Xuchuang; Wang, Zhiyong; Li, Shuai; Hajiesmaili, Mohammad; Lui, John; Chen, Wei (June 2024, Proceedings of the 41st International Conference on Machine Learning)

Full Text Available
Variance-Adaptive Algorithm for Probabilistic Maximum Coverage Bandits with General Feedback

https://doi.org/10.1109/INFOCOM53939.2023.10228940

Liu, Xutong; Zuo, Jinhang; Xie, Hong; Joe-Wong, Carlee; Lui, John C.S. (May 2023, IEEE INFOCOM 2023 - IEEE Conference on Computer Communications)

Full Text Available
